A Data Disclosure Policy for Count Data Based on the COM-Poisson Distribution

Authors

  • Joseph B. Kadane
  • Ramayya Krishnan
  • Galit Shmueli
Abstract

Count data arise in various organizational settings. When the release of such data is sensitive, organizations need information-disclosure policies that protect data confidentiality while still providing data access. In contrast to extant disclosure policies, we describe a new policy for count tables that is based on disclosing only the sufficient statistics of a flexible discrete distribution. This distribution, the COM-Poisson, approximates not only Poisson counts but also under- and over-dispersed counts well. The sufficient statistics mask the exact cell counts and often also the table size. Under the scenario of a data-holding agency and a data snooper, we show that this policy has low disclosure risk with no loss of data utility: usually, many count tables correspond to the disclosed sufficient statistics. Furthermore, these count tables are equally likely to be the undisclosed table. Finding these solutions requires solving a system of linear equations, which is underdetermined for tables with more than three cells and can be computationally prohibitive even for small tables. We also consider cell-specific interval bounds, a commonly used disclosure limitation policy, and compare them to our policy. We describe several types of snooper knowledge, their integration with the disclosed statistics, and their implications. Applying this policy to three real data sets, we illustrate the low associated disclosure risk.
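The snooper's problem described in the abstract can be illustrated with a minimal sketch. It assumes the disclosed quantities are the standard COM-Poisson sufficient statistics for a frequency table f_0, ..., f_J, namely n = Σ f_j, S1 = Σ j·f_j and S2 = Σ log(j!)·f_j; the abstract does not spell these out. The function names `sufficient_stats` and `consistent_tables` are hypothetical, and the brute-force enumeration is only an illustration, not the authors' solution method.

```python
from itertools import product
from math import isclose, lgamma


def sufficient_stats(freqs):
    """Return (n, S1, S2) for a frequency table.

    freqs[j] is the number of observations equal to the count j.
    Assumed statistics: n = sum f_j, S1 = sum j*f_j, S2 = sum log(j!)*f_j.
    """
    n = sum(freqs)
    s1 = sum(j * f for j, f in enumerate(freqs))
    # log(j!) equals lgamma(j + 1)
    s2 = sum(lgamma(j + 1) * f for j, f in enumerate(freqs))
    return n, s1, s2


def consistent_tables(n, s1, s2, num_cells):
    """Enumerate every nonnegative integer table matching (n, S1, S2).

    Brute force over all candidate tables, so only usable for tiny inputs.
    """
    matches = []
    for freqs in product(range(n + 1), repeat=num_cells):
        m, t1, t2 = sufficient_stats(freqs)
        # S2 is a float, so compare it with a tolerance
        if m == n and t1 == s1 and isclose(t2, s2, rel_tol=1e-9, abs_tol=1e-9):
            matches.append(freqs)
    return matches


# Hypothetical five-cell table: counts 0..4 observed 2, 1, 3, 2, 1 times.
table = (2, 1, 3, 2, 1)
stats = sufficient_stats(table)
matches = consistent_tables(*stats, num_cells=len(table))
print(stats)
print(matches)
```

With three cells the three equations pin down the table uniquely, which agrees with the abstract's remark that the system becomes underdetermined only beyond three cells. The search space here grows as (n + 1) to the power of the number of cells, which gives a sense of why the enumeration can become prohibitive even for small tables.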

Related resources

Bayesian paradigm for analysing count data in longitudinal studies using Poisson-generalized log-gamma model

In analyzing longitudinal data with count responses, a normal distribution is usually assumed for the random effects. However, in some applications the random effects may not be normally distributed. Misspecifying this distribution may reduce the efficiency of the estimators. In this paper, a generalized log-gamma distribution is used for the random effects which includes th...

Characterizing the Performance of the Bayesian Conway-Maxwell Poisson Generalized Linear Model

This paper documents the performance of a Bayesian Conway-Maxwell-Poisson (COM-Poisson) generalized linear model (GLM). This distribution was originally developed as an extension of the Poisson distribution in 1962 and has a unique characteristic, in that it can handle both under-dispersed and over-dispersed count data. Previous work by the authors led to the development of a dual-link GLM bas...

Characterizing the performance of the Conway-Maxwell Poisson generalized linear model.

Count data are pervasive in many areas of risk analysis; deaths, adverse health outcomes, infrastructure system failures, and traffic accidents are all recorded as count events, for example. Risk analysts often wish to estimate the probability distribution for the number of discrete events as part of doing a risk assessment. Traditional count data regression models of the type often used in ris...

A Flexible Regression Model for Count Data

Poisson regression is a popular tool for modeling count data and is applied in a vast array of applications from the social to the physical sciences and beyond. Real data, however, are often over- or under-dispersed and, thus, not conducive to Poisson regression. We propose a regression model based on the Conway–Maxwell-Poisson (COM-Poisson) distribution to address this problem. The COM-Poisson r...

Bivariate Conway-Maxwell-Poisson distribution: Formulation, properties, and inference

The bivariate Poisson distribution is a popular distribution for modeling bivariate count data. Its basic assumptions and marginal equi-dispersion, however, may prove limiting in some contexts. To allow for data dispersion, we develop here a bivariate Conway–Maxwell–Poisson (COM–Poisson) distribution that includes the bivariate Poisson, bivariate Bernoulli, and bivariate geometric distributions...

Journal:
  • Management Science

Volume 52  Issue

Pages  -

Publication date 2006